Measuring Word Relatedness Using Heterogeneous Vector Space Models

نویسندگان

  • Wen-tau Yih
  • Vahed Qazvinian
چکیده

Noticing that different information sources often provide complementary coverage of word sense and meaning, we propose a simple and yet effective strategy for measuring lexical semantics. Our model consists of a committee of vector space models built on a text corpus, Web search results and thesauruses, and measures the semantic word relatedness using the averaged cosine similarity scores. Despite its simplicity, our system correlates with human judgements better or similarly compared to existing methods on several benchmark datasets, including WordSim353.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Alternative measures of word relatedness in distributional semantics

This paper presents an alternative method to measuring word-word semantic relatedness in distributional semantics framework. The main idea is to represent target words as rankings of all co-occurring words in a text corpus, ordered by their tf – idf weight and use a metric between rankings (such as Jaro distance or Rank distance) to compute semantic relatedness. This method has several advantag...

متن کامل

Measuring author research relatedness: A comparison of word-based, topic-based, and author cocitation approaches

Relationships between authors based on characteristics of published literature have been studied for decades. Author cocitation analysis using mapping techniques has been most frequently used to study how closely two authors are thought to be in intellectual space based on how members of the research community co-cite their works. Other approaches exist to study author relatedness based more di...

متن کامل

Combining Word Representations for Measuring Word Relatedness and Similarity

Many unsupervised methods, such as Latent Semantic Analysis and Latent Dirichlet Allocation, have been proposed to automatically infer word representations in the form of a vector. By representing a word by a vector, one can exploit the power of vector algebra to solve many Natural Language Processing tasks e.g. by computing the cosine similarity between the corresponding word vectors the seman...

متن کامل

Measuring semantic relatedness with vector space models and random walks

Both vector space models and graph randomwalk models can be used to determine similarity between concepts. Noting that vectors can be regarded as local views of a graph, we directly compare vector space models and graph random walk models on standard tasks of predicting human similarity ratings, concept categorization, and semantic priming, varying the size of the dataset from which vector spac...

متن کامل

Representation Learning for Measuring Entity Relatedness with Rich Information

Incorporating multiple types of relational information from heterogeneous networks has been proved effective in data mining. Although Wikipedia is one of the most famous heterogeneous network, previous works of semantic analysis on Wikipedia are mostly limited on single type of relations. In this paper, we aim at incorporating multiple types of relations to measure the semantic relatedness betw...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2012